Abstract



Simple two-state folding kinetics of many small single-domain proteins are characterized 
by chevron plots with linear folding and unfolding arms consistent with an apparent 
two-state description of equilibrium thermodynamics. This phenomenon is hereby rec- 
ognized as a nontrivial heteropolymer property capable of providing fundamental insight 
into protein energetics. Many current protein chain models, including common lattice 
and continuum Go models with explicit native biases, fail to reproduce this generic pro- 
tein property. Here we show that simple two-state kinetics is obtainable from models 
with a cooperative interplay between core burial and local conformational propensities 
or an extra strongly favorable energy for the native structure. These predictions suggest 
that intramolecular recognition in real two-state proteins is more specific than that en- 
visioned by common Go-like constructs with pairwise additive energies. The many-body 
interactions in the present kinetically two-state models lead to high thermodynamic co- 
operativity as measured by their van't Hoff to calorimetric enthalpy ratios, implying that 
the native and denatured conformational populations are well separated in enthalpy by a 
high free energy barrier. It has been observed experimentally that deviations from Arrhe- 
nius behavior are often more severe for folding than for unfolding. This asymmetry may 
be rationalized by one of the present modeling scenarios if the effective many-body
cooperative interactions stabilizing the native structure against unfolding are less
dependent on temperature than the interactions that drive the folding kinetics.
INTRODUCTION



A logical test of any general conception about the driving forces in protein fold- 
ing is to ascertain whether polymer models incorporating the given idea can predict 
generic behavior of real proteins. 1,2 In these considerations, self-contained polymer
models with explicit chain representations 3 are of particular importance. Quite obviously
the relationship between model energetics and conformational distribution can only be 
addressed in a physically plausible manner when chain connectivity and excluded volume 
are adequately taken into account. Using this analytical framework, we found that even 
mundane protein properties such as calorimetric two-state cooperativity 4-6 and simple 
two-state folding/unfolding kinetics 7,8 are remarkable feats from a polymer standpoint.
Simply put, it is nontrivial to construct heteropolymer models with commonly used 
model interaction schemes to reproduce such behavior. These schemes include popular 2-, 3-,
and 20-letter models and traditional Go models (see below). 4-8 Generic protein properties
thus present severe constraints on modeling. Hence, insight into real protein energetics 
can be gained by requiring self-contained polymer models to satisfy such constraints. 

Motivated by the proposed consistency principle 9 or principle of minimal frustration 10 
for protein energetics, Go 11 and Go-like models (see, e.g., refs. 8, 12-16 and references 
therein) have long been used in protein folding investigations. These models postulate 
that only intrachain interactions found in the native (ground-state) conformation are 
favorable; all other possible intrachain interactions are assumed to be either neutral or
unfavorable. Recently, this native-centric approach to modeling has often been justified 
as well by the discovery that folding rates of natural small single-domain proteins are 
well correlated with the contact order 17 of their native structures. How well do com- 
mon Go models mimic the generic properties of small single-domain proteins? For chain 
models configured on two-dimensional square lattices, we found that even with their ex- 
plicit native biases, the common Go interaction scheme falls far short of producing the 
type of calorimetric two-state cooperativity observed for many small proteins. 4
Three-dimensional Go-like lattice 5-7 and continuum 8 models are more proteinlike in this
regard, as many of them may be considered calorimetrically cooperative if certain latitude
is allowed for empirical baseline subtractions. 5
However, common Go-like schemes are apparently not capable of producing simple
two-state folding/unfolding kinetics with linear chevron plots. 18 This we have recently
demonstrated in several examples, 7,8 including lattice and continuum (off-lattice)
models as well as models with a rudimentary implicit-solvent treatment of desolvation
barriers. 8,19 Thus, the inability of common Go-like constructs to predict simple two-state
folding/unfolding kinetics is not an artifact restricted only to lattice Go models. Most 
likely, it is a fundamental problem arising from the additive nature of common Go-like 
interaction schemes. Our results indicate strongly that such model interaction schemes,
on-lattice and otherwise, afford insufficient cooperativity to capture real two-state
protein energetics. Here we address this basic question by constructing and testing novel 
native-centric lattice models that go beyond common additive Go-like schemes, with 
intrachain potentials that can lead to simple two-state folding/unfolding kinetics. The 
ultimate goal of this line of inquiry, of which the present lattice exercise is only a first 
step, is to decipher the many-body cooperative interactions underlying the behavior of 
small single-domain proteins. 

MODELING A COOPERATIVE INTERPLAY BETWEEN LOCAL 
CONFORMATIONAL PREFERENCE AND PROTEIN CORE 
FORMATION 

We first explore several native-centric variants of a 55mer model (Figures 1-4). Their 
basic features are derived from the original model we put forth recently. 6 For the Go-like 
constructs studied here, we retain contributions from the term disfavoring the initiation 
of left-handed helices (equation 1 of ref. 6). Contributions from the 5-letter contact
energies are retained for the native contacts in the ground-state conformation, whereas 
all nonnative contacts are assigned zero energy as in common Go models. The resulting 
native-centric model has the energy function 

$E = \mathcal{E}'_{\rm contact} + \gamma_{\rm lh} N_{\rm lh} ,$ (1)

where the prime superscript on the $\mathcal{E}'_{\rm contact}$ term indicates that the sum of pairwise
5-letter energies is restricted to native contacts, and the second term on the right disfavors
left-handed helices. Here we use the same contact energies and $\gamma_{\rm lh}$ parameter as in ref. 6.
The ground-state energy of the present model equals $-36.1$. We refer to this as model (i).
Next we consider a model that embodies the idea of a cooperative interplay between
local conformational preference and the (nonlocal) packing of the protein core. 4,6,7 This
is motivated by the observation that secondary structure formation in globular proteins 
is often context dependent, and that short helices are often not stable in isolation but 
are stable when packed in the core of a protein (see, e.g., ref. 20 and references therein). 
Here we mimic this effect by assigning a favorable energy $\mathcal{E}_{\rm coop}$ to each incidence of the
conformational situation described in Figure 1, leading to the energy function

$E = \mathcal{E}'_{\rm contact} + \gamma_{\rm lh} N_{\rm lh} + \mathcal{E}_{\rm coop} h_c ,$ (2)

where $h_c$ counts the incidences of a fully formed native helix having the cooperative
interactions defined by Figure 1, and $0 \le h_c \le 4$ for the present 55mer model. This we
refer to as model (ii).

To account for the possibility that optimal packing of the protein core as a whole 
can impart significant thermodynamic stability to the native structure, we consider yet 
another model with an extra favorable energy assigned only to the native conformation. 
The energy function now becomes 

$E = \mathcal{E}'_{\rm contact} + \gamma_{\rm lh} N_{\rm lh} + \mathcal{E}_{\rm coop} h_c + E_{\rm gs} ,$ (3)

where the added $E_{\rm gs}$ term takes a nonzero favorable value only when the chain
is in its unique ground-state conformation. We refer to this as model (iii). We note
that the $\mathcal{E}_{\rm coop}$ and $E_{\rm gs}$ terms introduced in equations 2 and 3 are many-body in nature.
Many-body interactions have been investigated in the context of protein folding (see, 
e.g., refs. 6, 21, 22). However, their relationship with linear chevron plots and simple 
two-state kinetics has not been much explored. 
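
The three energy functions of equations 1-3 differ only in which many-body terms are switched on, which the following minimal Python sketch tries to make explicit. The `Conformation` record and the function name are hypothetical conveniences, the `gamma_lh` value is a placeholder rather than the parameter of ref. 6, and the example assumes that the native conformation has no left-handed helix initiation and the maximum of four cooperative helix incidences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conformation:
    """Hypothetical per-conformation summary needed by the energy functions."""
    native_contact_energy: float  # sum of 5-letter energies over native contacts only
    n_lh: int                     # number of left-handed helix initiations
    h_c: int                      # native helices packed as described in Figure 1
    is_ground_state: bool         # True only for the unique native conformation


def model_energy(conf, gamma_lh, e_coop=0.0, e_gs=0.0):
    """Equations 1-3: e_coop = e_gs = 0 gives model (i); e_coop < 0 alone gives
    model (ii); e_coop < 0 together with e_gs < 0 gives model (iii)."""
    energy = conf.native_contact_energy + gamma_lh * conf.n_lh
    energy += e_coop * conf.h_c
    if conf.is_ground_state:
        energy += e_gs            # many-body term acting on the ground state only
    return energy


# The -36.1 ground-state energy of model (i) is from the text; gamma_lh = 5.0
# is a placeholder, and n_lh = 0, h_c = 4 for the native state are assumptions.
native = Conformation(native_contact_energy=-36.1, n_lh=0, h_c=4, is_ground_state=True)
e_i = model_energy(native, gamma_lh=5.0)
e_ii = model_energy(native, gamma_lh=5.0, e_coop=-1.0)
e_iii = model_energy(native, gamma_lh=5.0, e_coop=-1.0, e_gs=-2.0)
```

Under these assumptions the native energy is lowered by four units of the cooperative term in model (ii) and by a further ground-state term in model (iii), while non-native conformations with `h_c = 0` are unaffected.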

Figure 2 shows that the many-body cooperative interactions introduced above en- 
hance thermodynamic cooperativity. In calculating the heat capacities of these models, 
we made the simplifying assumption that the interactions are temperature independent, 
and set enthalpy equal to the model energy, as in our previous investigations. 5-8 For 
calorimetric two-state behavior, the van't Hoff to calorimetric enthalpy ratio $\Delta H_{\rm vH}/\Delta H_{\rm cal}$
has to be close to one. 4-6 Now the $\Delta H_{\rm vH}/\Delta H_{\rm cal}$ ratio ($\kappa_2$ in ref. 5 without empirical
baseline subtraction) equals 0.804 for model (i), which does not contain many-body
cooperative interactions. But it is considerably higher at 0.88 and 0.91, respectively, for
models (ii) and (iii) with favorable values of $\mathcal{E}_{\rm coop}$ and $E_{\rm gs}$ (Figure 2, upper panel).*
This is not too surprising because the many-body cooperative interactions defined above 
tend to increase the energetic (enthalpic) separation between the ground-state and near 
ground-state conformations on one hand and the open unfolded conformations on the 
other, or disfavor conformations with intermediate energy (enthalpy), or both. Both 
of these effects would lead to higher calorimetric cooperativity. 4,5 The thermodynamic
ramifications of these interactions are further explored in the lower panel of Figure 2, 
which covers a broad range of values for $\mathcal{E}_{\rm coop}$ and $E_{\rm gs}$. An additional scenario in which
an extra favorable energy for the ground state is added to model (i), viz.,

$E = \mathcal{E}'_{\rm contact} + \gamma_{\rm lh} N_{\rm lh} + E_{\rm gs} ,$ (4)

is also studied [curve (a)]. The trend observed in the lower panel of Figure 2 is that 
stronger many-body cooperative interactions of the type defined above generally lead to 
higher calorimetric cooperativity. However, there appears to be an upper limit on $\kappa_2$
($\approx 0.96$) achievable by the helix-packing term alone [curve (b)], because at very high
$-\mathcal{E}_{\rm coop}$ values it is possible that some intermediate non-ground-state conformations can
become relatively stable (cf. Figure 1).
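
The link between intermediate-energy populations and the enthalpy ratio can be illustrated with a toy density of states. The sketch below assumes the commonly used expression $\Delta H_{\rm vH} = 2T_{\max}\sqrt{k_{\rm B}C(T_{\max})}$ with no baseline subtraction, which we take to correspond to $\kappa_2$ of ref. 5, sets $k_{\rm B} = 1$, and uses invented two- and three-level systems rather than the density of states of the 55mer models.

```python
import math


def heat_capacity(levels, temperature):
    """C(T) = (<E^2> - <E>^2) / T^2 with k_B = 1.

    `levels` is a list of (energy, ln_degeneracy) pairs.
    """
    log_w = [ln_g - energy / temperature for energy, ln_g in levels]
    shift = max(log_w)                      # log-sum-exp shift avoids overflow
    weights = [math.exp(lw - shift) for lw in log_w]
    z = sum(weights)
    mean = sum(w * e for w, (e, _) in zip(weights, levels)) / z
    mean_sq = sum(w * e * e for w, (e, _) in zip(weights, levels)) / z
    return (mean_sq - mean * mean) / temperature ** 2


def kappa2(levels, t_min=0.05, t_max=5.0, n_points=5000):
    """van't Hoff to calorimetric enthalpy ratio without baseline subtraction."""
    step = (t_max - t_min) / (n_points - 1)
    temps = [t_min + i * step for i in range(n_points)]
    caps = [heat_capacity(levels, t) for t in temps]
    peak = max(range(n_points), key=caps.__getitem__)
    dh_vh = 2.0 * temps[peak] * math.sqrt(caps[peak])
    # Trapezoidal integral of C(T) over the whole temperature grid.
    dh_cal = sum(0.5 * (caps[i] + caps[i + 1]) * step for i in range(n_points - 1))
    return dh_vh / dh_cal


# Hypothetical densities of states: a unique native level, a highly degenerate
# denatured level, and optionally an intermediate level populated at the midpoint.
two_level = [(-30.0, 0.0), (0.0, 40.0)]
with_intermediate = two_level + [(-15.0, 20.0)]
k2_two_level = kappa2(two_level)
k2_with_intermediate = kappa2(with_intermediate)
```

In this toy setting the strictly two-level system gives a ratio close to one, whereas populating an intermediate enthalpy level at the transition midpoint lowers it appreciably, in line with the qualitative argument above.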

Figure 3 presents the chevron plots for the three models considered in the upper panel 
of Figure 2. To model folding and unfolding kinetics at different interaction strengths, 
an energetic scaling parameter $\epsilon$ is introduced. At a given $\epsilon$, the effective energy of a
conformation with energy $E$ (given by equations 1, 2 or 3) is equal to $-\epsilon E$; and variation
in $\epsilon/k_{\rm B}T$ (at constant $T$) serves as a model denaturant concentration variation, as
in ref. 7. Figure 3 shows that at sufficiently strong intrachain interaction (more negative
$\epsilon/k_{\rm B}T$), every folding arm of the three chevron plots exhibits a rollover. This suggests
that chevron rollover is practically unavoidable in polymer models with physically plausible
interactions, because when intrachain interactions become generally very favorable,
kinetic trapping is bound to increase in importance. 7,8 However, native thermodynamic
stability would be extremely high when the model parameter $\epsilon/k_{\rm B}T$ becomes extremely
negative. Many such situations are not physically realizable in real proteins, 7 whose
native stabilities even in zero denaturant are often marginal. In this light, in comparing
Figure 3 with experiments, the relevant question is whether there is a quasi-linear regime
of the model chevron plots that is consistent with the two-state thermodynamics of the
given model and covers a range of thermodynamic stability similar to that of real, simple
two-state proteins.

*For every model considered in Figure 2, the value of the $\Delta H_{\rm vH}/\Delta H_{\rm cal}$ ratio after
empirical baseline subtraction (ref. 5) equals 1.0.

Pursuing this logic, we note that folding rollover occurs quite near to the transition
midpoint for model (i), but the $\epsilon/k_{\rm B}T$ range of a quasi-linear regime is more extended
for the two more cooperative models (ii) and (iii). The folding arms of models (ii) and
(iii) are identical because, by construction, while the $E_{\rm gs}$ term in equation 3 slows
unfolding, it does not affect the kinetics of folding. For model (iii), we have used standard
histogram techniques and extensive conformational sampling 7 at $\epsilon/k_{\rm B}T = -2.105$ to
determine the dependence of the free energy of unfolding $\Delta G_{\rm u}$ on $\epsilon/k_{\rm B}T$ (detailed data
not shown). The resulting thermodynamic relation, which is essentially linear and has
approximately the same transition midpoint as determined from the kinetic chevron plot,
was applied to construct the dotted V-shape in Figure 3. The close agreement between
the dotted V-shape and the simulated chevron plot for model (iii) from $\epsilon/k_{\rm B}T \approx -2.3$
to $-1.6$ implies that the folding/unfolding kinetics of model (iii) is consistent with a
simple two-state description within this regime. The strongest intrachain interaction
in the two-state regime is at $\epsilon/k_{\rm B}T \approx -2.31$, which corresponds to a native stability
$\Delta G_{\rm u} \approx 10k_{\rm B}T$ for this particular model. It is clear from comparing the chevron plots
of models (ii) and (iii) that the linear regime can be readily extended by increasing the
magnitude of $E_{\rm gs}$ beyond that in model (iii). But even as it stands, model (iii)'s behavior
in Figure 3 should provide a semi-quantitative rationalization for the simple two-state
kinetics of many small single-domain proteins. Indeed, more than half of the 24 two-state
proteins listed by Plaxco et al. (2000) 2 have native stabilities around 25°C comparable
to or lower than $10k_{\rm B}T$. For example, $\Delta G_{\rm u} = 3.6k_{\rm B}T$ (2.1 kcal/mol) for CspB at 25°C
and pH 7.0 (ref. 23), and $\Delta G_{\rm u} = 9.0k_{\rm B}T$ (5.3 kcal/mol) for protein L at 22°C and pH
7.0 (ref. 24). In our view, therefore, simple two-state folding/unfolding kinetics emerges
as a limiting-case phenomenon when the hypothetically high native stability at which
chevron rollover would occur is not attainable by a small single-domain protein.
Conversely, rollover becomes observable when a protein fails to achieve a sufficiently high
thermodynamic cooperativity commensurate with its native stability.
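
The consistency check between a chevron plot and two-state thermodynamics can be written compactly. The following is only a sketch under the assumption that both arms are exactly linear in $\epsilon/k_{\rm B}T$ and cross at the transition midpoint; the function name, the slopes, the midpoint, and the midpoint rate are illustrative values, not fits to Figure 3.

```python
import math


def two_state_chevron(x, ln_kf_mid, m_f, m_u, x_mid):
    """Return (ln k_f, ln k_u, ln k_obs, dG_u/k_BT) at interaction strength x.

    x stands for epsilon/k_BT. Both arms are assumed linear in x and to cross
    at the transition midpoint x_mid, where k_f = k_u.
    """
    ln_kf = ln_kf_mid + m_f * (x - x_mid)
    ln_ku = ln_kf_mid + m_u * (x - x_mid)
    # Observed relaxation rate of a two-state system: k_obs = k_f + k_u.
    hi = max(ln_kf, ln_ku)
    ln_kobs = hi + math.log(math.exp(ln_kf - hi) + math.exp(ln_ku - hi))
    dg_u = ln_kf - ln_ku          # Delta G_u / k_BT = ln(k_f / k_u)
    return ln_kf, ln_ku, ln_kobs, dg_u


# Illustrative parameters only (hypothetical slopes, midpoint, and midpoint rate).
params = dict(ln_kf_mid=-14.0, m_f=-12.0, m_u=8.0, x_mid=-1.8)
at_midpoint = two_state_chevron(-1.8, **params)
at_strong_interaction = two_state_chevron(-2.3, **params)
```

A simulated chevron plot is consistent with a simple two-state description over the range of interaction strengths where its rates follow such a V-shape constructed from the independently determined $\Delta G_{\rm u}$.
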
A comparison between the kinetic properties in Figure 3 and the thermodynamic
properties in Figure 2 indicates that the thermodynamic requirement of simple two-state
behavior is stringent, allowing only for a small adjustment from empirical baseline
subtraction. 5 Apparently, a model has to be nearly as cooperative as model (iii), or more
so, to achieve a reasonable reproduction of simple two-state protein folding/unfolding
kinetics. This suggests strongly that, in modeling situations when the heat capacity
contributions from bond vibrations are not considered (as in the present cases) and intrachain
interaction energies are taken to be temperature-independent, a without-baseline-subtraction
$\Delta H_{\rm vH}/\Delta H_{\rm cal}$ ratio of $\kappa_2 > 0.9$ would likely be required for simple two-state
kinetics [Figure 2, upper panel, model (iii)].

We have also checked that folding kinetics is essentially single exponential within
the quasi-linear regime by confirming that the logarithmic distribution of folding first
passage times is approximately linear. 7,8,25 In fact, extensive testing for ten values of
$\epsilon/k_{\rm B}T$ between $-2.22$ and $-2.78$ covering the quasi-linear regime and beyond indicates
that they are consistent with single exponential relaxation. Unfolding kinetics is essentially
single-exponential as well (detailed data not shown). As in our previous study, 7
the onset of non-exponential folding relaxation at interaction strength $\epsilon/k_{\rm B}T \approx -2.9$ is
concomitant with that of a drastic chevron rollover.
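
One way to carry out such a single-exponential check is to fit the logarithm of the fraction of trajectories that have not yet folded against time. The sketch below is a generic illustration rather than the analysis used for the figures; the function name is hypothetical and the first passage times are synthetic exponential samples standing in for simulation output.

```python
import math
import random


def log_survival_fit(first_passage_times, keep_fraction=0.9):
    """Least-squares slope and R^2 of ln P(t_FP > t) versus t.

    The sparsely sampled tail beyond `keep_fraction` is dropped.
    """
    times = sorted(first_passage_times)
    n = len(times)
    n_keep = int(keep_fraction * n)
    xs = times[:n_keep]
    # After the i-th shortest passage (0-based), a fraction 1 - (i + 1)/n survives.
    ys = [math.log(1.0 - (i + 1) / n) for i in range(n_keep)]
    mean_x = sum(xs) / n_keep
    mean_y = sum(ys) / n_keep
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    r_squared = sxy * sxy / (sxx * syy)
    return slope, r_squared


# Synthetic stand-in for simulated first passage times (not data from this work).
rng = random.Random(0)
mfpt = 2.0e6                                   # hypothetical MFPT in Monte Carlo steps
samples = [rng.expovariate(1.0 / mfpt) for _ in range(5000)]
slope, r_squared = log_survival_fit(samples)
```

For single-exponential relaxation the fitted slope is close to $-1/{\rm MFPT}$ and the plot is nearly straight; pronounced curvature would signal the non-exponential relaxation that accompanies chevron rollover.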

Not surprisingly, Figure 4a shows that the free energy barrier separating the native
and denatured states is higher for a more cooperative model, consistent with its slower
folding and unfolding rates at the transition midpoint (cf. Figure 3). Figure 4b shows
that the relation between energy and the number of native contacts is approximately
linear for the two cooperative models. In this regard, the present exercise suggests
that certain many-body interactions embodying a local-nonlocal cooperative interplay
($\mathcal{E}_{\rm coop} = -1.0$) and an added ground-state stability ($E_{\rm gs} = -2.0$) in proteins can lead
to remarkable improvements in kinetic cooperativity even when the magnitudes of these
terms are relatively small.
MODELING A PARTIAL SEPARATION BETWEEN THE
INTERACTIONS FOR THERMODYNAMIC STABILITY AND THE 
DRIVING FORCES FOR FOLDING KINETICS 

Having established a plausible scenario for simple two-state protein folding/unfolding
kinetics, we proceed to broaden our exploration and to better delineate how various
energetic components might contribute to this remarkable behavior. As a first step, we
consider in this section a somewhat different class of models in which a local-nonlocal
cooperative interplay is absent but the unique ground-state conformation is favored by
an extra strong energy. The interaction scheme is a simplified version of equation 4
above, with the energy function

$E = E_{\rm Go} + E_{\rm gs} ,$ (5)

where $E_{\rm Go}$ is the usual lattice Go potential that assigns a favorable energy $\epsilon_0$ ($< 0$) to
every contact in the native structure of the model and assigns zero energy to all other
(nonnative) contacts, and $E_{\rm gs}$ applies only to the ground-state conformation, as in
equations 3 and 4. To explore the effect of native topology, we study three cubic-lattice 27mer
models with $E$ given by equation 5 for three different ground-state structures (Figures 5
and 6).

As discussed above, the $E_{\rm gs}$ term serves to increase native stability and enhance
thermodynamic stability, leading to a reduced unfolding rate. But it has no effect on
the folding kinetics modeled by Monte Carlo dynamics with the Metropolis acceptance
criterion. 26,27 Thus, the energetics described by equation 5 entails a partial separation
between the interactions that drive the protein to fold kinetically (the pairwise contact
energies $E_{\rm Go}$) and the interactions that stabilize the ground-state structure ($E_{\rm Go}$ and
the many-body $E_{\rm gs}$). Since $E_{\rm Go}$ contributes partially to native stability, the role
separation just described becomes a more predominant feature of the model when $E_{\rm gs}$ is large
compared to $\epsilon_0$. A similar mechanism of partial separation between folding-kinetics and
native-stabilizing interactions is also presumed by equation 3 [model (iii)] and equation 4
above. Our interest in this scenario was partly motivated by experimental studies showing
that mutants of a wildtype protein are much less likely to have a slower unfolding
rate than to have a faster folding rate. For example, among the 41 mutants of Fyn SH3
domain studied by Northey et al., 28 only 3 have slightly reduced unfolding rates relative
to that of the wildtype, whereas five times as many (15) mutants have folding rates
faster than that of the wildtype. This means that interactions that accelerate folding
do not necessarily lead to higher native stability (only 4 mutants are more stable than
the wildtype), presumably because some mutants that fold fast do not pack well when
folded 28 (cf. discussion of conformational strain by Ventura et al. 29 ). These observations
suggest that a partial separation of folding-kinetics and native-stabilizing intraprotein
interactions as envisioned by the $E_{\rm gs}$ term is physically plausible.
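
The reason a ground-state-only term slows unfolding without touching folding under Metropolis dynamics can be seen from the acceptance rule itself. The sketch below is a simplified illustration: the function names are hypothetical, the scaling parameter is absorbed into `kT`, and the contact counts (28 native contacts in the ground state, 26 in a neighboring conformation) are assumed for the example rather than taken from the models of Figure 5.

```python
import math


def metropolis_acceptance(e_old, e_new, kT=1.0):
    """Metropolis criterion: accept downhill moves always, uphill with exp(-dE/kT)."""
    d_e = e_new - e_old
    return 1.0 if d_e <= 0.0 else math.exp(-d_e / kT)


def go_energy(n_native_contacts, eps0, e_gs, is_ground_state):
    """Equation 5: pairwise Go term plus the ground-state-only term."""
    energy = eps0 * n_native_contacts
    if is_ground_state:
        energy += e_gs
    return energy


def last_step_probabilities(e_gs, eps0=-1.0, kT=1.0):
    """Acceptance probabilities for entering and leaving the ground state.

    Assumes a hypothetical 27mer whose ground state has 28 native contacts and
    whose neighboring conformation has 26.
    """
    e_native = go_energy(28, eps0, e_gs, True)
    e_near = go_energy(26, eps0, e_gs, False)
    p_in = metropolis_acceptance(e_near, e_native, kT)
    p_out = metropolis_acceptance(e_native, e_near, kT)
    return p_in, p_out


p_in_go, p_out_go = last_step_probabilities(e_gs=0.0)       # common Go model
p_in_coop, p_out_coop = last_step_probabilities(e_gs=-14.0)  # with ground-state term
```

Because every move into the ground state is already downhill in the Go potential, making the ground state more favorable leaves the entry probability at one, whereas the exit probability is reduced by the factor $\exp(E_{\rm gs}/k_{\rm B}T)$.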

Figure 5 shows that combining a pairwise Go potential with an $E_{\rm gs}$ term can also lead
to simple two-state protein folding/unfolding kinetics, although for these relatively short
chains the $E_{\rm gs}/\epsilon_0$ ratio needed to achieve simple two-state behavior is large. Figure 5a
provides a series of unfolding chevron arms for different $E_{\rm gs}$ values, showing clearly that
the quasi-linear regime of the chevron plot can be extended by a more negative $E_{\rm gs}$. For
all three models considered in Figure 5 with $E_{\rm gs}/\epsilon_0 = 14$, approximately simple two-state
behavior persists to $\epsilon/k_{\rm B}T \approx -1.15$ (cf. simulated chevron plots and dotted V-shapes),
corresponding to native stabilities $\Delta G_{\rm u} \approx 10k_{\rm B}T$. As for model (iii) above (Figure 3), for
each model in Figure 5, we have verified that folding relaxation is essentially single
exponential for $\epsilon/k_{\rm B}T > -1.6$ by obtaining linear logarithmic first passage time distributions
for six $\epsilon/k_{\rm B}T$ values from $-1.67$ to $-0.91$. Unfolding relaxation is also essentially
single-exponential (detailed data not shown). Folding relaxation becomes non-exponential (for
$\epsilon/k_{\rm B}T < -1.8$) only when native stability is much higher than that spanned by the
simple two-state regime between $\epsilon/k_{\rm B}T \approx -1.15$ and $\epsilon/k_{\rm B}T \approx -0.723$.

Figure 6 compares chevron plots of the three cooperative 27mer models. The rank 
ordering of their folding rates is consistent with a correlation between slower folding 
rate and higher relative contact order (CO). 17 However, for these models, the depen- 
dence of folding rate on CO is weak. Near the onset of drastic chevron rollover and 
non-exponential folding relaxation ($\epsilon/k_{\rm B}T \approx -1.75$), the 27mer model with CO = 0.51
folds only approximately 4 times slower than the 27mer model with CO = 0.28. The 
dispersion in folding rate is even smaller within the simple two-state regime. This is a 
far cry from the six orders of magnitude of variation in folding rates observed among 
real, small, single-domain proteins. 2 Recently, CO-dependent folding rates have been 
addressed using explicit-chain models with limited yet encouraging successes. Using a 
Go-like potential for 18 small single-domain proteins, Koga and Takada 31 obtained a
correlation between CO and folding rate, but the variation in rates covered only 1.5
orders of magnitude. More recently, Jewett et al. 30 conducted an extensive lattice 27mer
simulation study using cooperative Go-like models with a nonlinear E-Q relation. A
correlation between CO and folding rate was found but again the dispersion in folding
rates spanned only 1 to 1.5 orders of magnitude. While the mechanisms and energetics
of CO-dependent folding remain to be better elucidated, 32 our more recent investigation
shows that models with a local-nonlocal cooperative interplay similar to that in models
(ii) and (iii) (equations 2 and 3) above can lead to a relatively large dispersion in folding
rates and a better correlation between CO and folding rate. 33,34



A RATIONALIZATION OF NON-ARRHENIUS PROTEIN 
FOLDING/UNFOLDING KINETICS 



The physics embodied by the extra favorable ground-state energy $E_{\rm gs}$ in the models
described by equations 3-5 above implies that there is a fundamental asymmetry
between folding and unfolding kinetics. 28 This led us to ask whether the same physical
picture may shed light on the significant difference in the degree of deviation from Arrhenius
kinetics for folding versus unfolding that is often observed in experiments. Early
measurements by Segawa and Sugihara 35 showed that the folding kinetics of hen egg-white
lysozyme was significantly non-Arrhenius (logarithmic folding rate $\ln k_{\rm f}$ nonlinear
in $1/T$) whereas the unfolding kinetics was essentially Arrhenius (logarithmic unfolding
rate $\ln k_{\rm u}$ linear in $1/T$). Table I summarizes more recent experimental data from
the literature for several proteins with simple two-state folding/unfolding kinetics and
whose temperature-dependent rates of both folding and unfolding have been measured
directly. For the proteins listed, the trend that folding is more non-Arrhenius than
unfolding is quantified by reported activation heat capacities for folding $(\Delta C_p^\ddagger)_{\rm f}$ that are
significantly larger in magnitude than the corresponding activation heat capacities for
unfolding $(\Delta C_p^\ddagger)_{\rm u}$. Table I puts the "$(\Delta C_p^\ddagger)_{\rm f}/(\Delta C_p^\ddagger)_{\rm u}$" ratio in quotation marks because
the common approach of using temperature-independent activation heat capacities to
analyze folding/unfolding kinetics data may be problematic. 17 We note that another
potential source of the difficulty is that possible temperature dependencies of the heat
capacities 39 associated with protein folding/unfolding transitions were not considered in
such analyses. Nonetheless, as an empirical parameter, "$(\Delta C_p^\ddagger)_{\rm f}/(\Delta C_p^\ddagger)_{\rm u}$" serves well to
demonstrate that $\ln k_{\rm f}$ is often significantly more curvilinear in $1/T$ than $\ln k_{\rm u}$.

This trend can be captured qualitatively in the present modeling context if the 
solvent-mediated driving forces for folding kinetics and the many-body native-stabilizing 
interactions are taken to have different temperature dependencies. This is a physically 
plausible assumption because some intraprotein solvent-mediated forces such as the hy- 
drophobic effect are known to be sensitive to the sizes and shapes of the interacting 
groups. 40-42 Here we use a collection of cooperative 27mer models in Figure 5a with
different values of $E_{\rm gs}$ to expound the principles involved. Temperature-dependent
interactions are now introduced by letting the pairwise contact energy $\epsilon_0$ vary with
temperature in a hydrophobic-like manner while leaving $E_{\rm gs}$ temperature-independent.

The following schematic analysis of T-dependent folding and unfolding rates (Fig- 
ure 7) is similar to that introduced by Chan 26 and Chan and Dill. 27 However, the 
present focus on folding-unfolding asymmetry in the context of three-dimensional protein 
chain models was not addressed in these earlier studies of short-chain two-dimensional 
models. 26,27 The first step in the present analysis is to obtain from Figure 5a the logarithmic
folding and unfolding rates $\ln k_{\rm f}$ and $\ln k_{\rm u}$ [which are taken to be their respective
$-\ln({\rm MFPT})$] as functions of $\epsilon$ and $E_{\rm gs}$. Since the effective energy is given by $-\epsilon E$ in
Figure 5 (see above) where $E$ is given by equation 5 with $\epsilon_0$ set to $-1$, each
($\epsilon$, $E_{\rm gs}$)-dependent datapoint in Figure 5a may be regarded as the folding or unfolding rate for
the energy function $E$ itself with an $\epsilon_0$ value equal to that of $\epsilon$ and an $E_{\rm gs}$ value equal to $\epsilon$
times the $E_{\rm gs}$ for the given unfolding chevron arm. We note that within the quasi-linear
single-exponential regime,

$\ln k_{\rm f} = \alpha_{\rm f} + \beta_{\rm f} \frac{\epsilon_0}{k_{\rm B}T}$ (6)

holds approximately for constant $\alpha_{\rm f}$ and $\beta_{\rm f}$, because folding kinetics is independent of
$E_{\rm gs}$. A least-squares fit yields $\beta_{\rm f} = -15.4$. For unfolding within the quasi-linear
single-exponential regime, the approximate linear relation

$\ln k_{\rm u} = \alpha_{\rm u} + \beta_{\rm u} \frac{\epsilon_0}{k_{\rm B}T} + \beta'_{\rm u} \frac{E_{\rm gs}}{k_{\rm B}T}$ (7)

is expected, where $\alpha_{\rm u}$, $\beta_{\rm u}$, and $\beta'_{\rm u}$ are constants. Extensive analyses indicate that $\beta_{\rm u} \approx 4.0$
and $\beta'_{\rm u} \approx 1.0$. We use $\beta_{\rm u} = 3.9$, $\beta'_{\rm u} = 1$ below. Figure 7a shows that these values fit the
simulated unfolding rates of the more cooperative models extremely well.



Next, a hypothetical temperature-dependent $\epsilon_0 = \epsilon_0(T)$ is introduced in Figure 7b.
This functional form for $\epsilon_0/k_{\rm B}T$ (solid curve, left scale) was motivated by the temperature
dependence of hydrophobic effects 39 and is similar to that explored in refs. 26 and 27. It
follows that the temperature-dependent folding rate in the quasi-linear single-exponential
regime is now given by equation 6 above with $\epsilon_0 \to \epsilon_0(T)$ provided by Figure 7b, viz.,

$\ln k_{\rm f}(T) = \alpha_{\rm f} + \beta_{\rm f} \frac{\epsilon_0(T)}{k_{\rm B}T} .$ (8)

Similarly, the temperature-dependent unfolding rate in the quasi-linear single-exponential
regime of an effective $E_{\rm gs} = -14$ cooperative model is given by equation 7 above with
$\epsilon_0 \to \epsilon_0(T)$ provided by Figure 7b while $E_{\rm gs}$ remains temperature independent:

$\ln k_{\rm u}(T) = \alpha_{\rm u} + \beta_{\rm u} \frac{\epsilon_0(T)}{k_{\rm B}T} + \beta'_{\rm u} \frac{E_{\rm gs}}{k_{\rm B}T} ,$ (9)

where the $E_{\rm gs}/k_{\rm B}T$ term is set equal to $-14$ for a reference temperature ($T^*$) in
Figure 7c at which $\epsilon_0/k_{\rm B}T = -1$. Hence $E_{\rm gs}/k_{\rm B}T = -14(T^*/T)$ is linear in $1/T$. These
temperature-dependent folding and unfolding rates are plotted in the upper part of
Figure 7c. It is clear that folding is significantly more non-Arrhenius than unfolding
because the only source of non-Arrhenius behavior in the present formulation of the model
is $\epsilon_0(T)$, and $\ln k_{\rm f}$ depends more strongly on $\epsilon_0(T)$ ($\beta_{\rm f} = -15.4$) than $\ln k_{\rm u}$ ($\beta_{\rm u} = 3.9$).
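
The folding-unfolding asymmetry implied by equations 8 and 9 can be checked numerically. In the sketch below the $\beta$ values are those quoted in the text, whereas the $\alpha$ constants are set to zero as placeholders and the quadratic form of $\epsilon_0(T)/k_{\rm B}T$ is a hypothetical stand-in for the solid curve of Figure 7b, chosen only so that it equals $-1$ at $T^*$ and is curved in $1/T$.

```python
def ln_kf(x, alpha_f=0.0, beta_f=-15.4):
    """Equation 8, with x = eps0(T)/k_BT. alpha_f is a placeholder."""
    return alpha_f + beta_f * x


def ln_ku(x, egs_over_kt, alpha_u=0.0, beta_u=3.9, beta_u_prime=1.0):
    """Equation 9, with x = eps0(T)/k_BT and egs_over_kt = E_gs/k_BT."""
    return alpha_u + beta_u * x + beta_u_prime * egs_over_kt


def eps0_over_kt(inv_t, curvature=0.8):
    """Hypothetical stand-in for the solid curve of Figure 7b.

    inv_t = T*/T. The quadratic form and the curvature value are illustrative
    assumptions; the function equals -1 at inv_t = 1.
    """
    return -1.0 + curvature * (inv_t - 1.0) ** 2


def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def fold(inv_t):
    return ln_kf(eps0_over_kt(inv_t))


def unfold(inv_t):
    # E_gs/k_BT = -14 (T*/T) is linear in 1/T and adds no curvature.
    return ln_ku(eps0_over_kt(inv_t), egs_over_kt=-14.0 * inv_t)


curv_f = second_derivative(fold, 1.1)
curv_u = second_derivative(unfold, 1.1)
```

Whatever the assumed form of $\epsilon_0(T)$, the curvatures of the two logarithmic rates in $1/T$ are in the ratio $\beta_{\rm f}/\beta_{\rm u}$, so the Arrhenius plot for folding is roughly four times more curved than that for unfolding.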

One missing physical ingredient in the consideration thus far is that intrinsic con- 
formational transition rates should accelerate at higher temperature. This is not taken 
into account if physical time is simply identified with number of attempted Monte Carlo 
moves, as in the analysis above. The issue has been identified and addressed in some de- 
tail in refs. 26 and 27. As in these references, Figure 7b introduces an adjustment factor 
(dotted line, right scale) to better mimic physical time. Here $A(T)$ is the
temperature-dependent time needed for a given kinetic process and $A_0$ is a reference time. Thus the
hypothetical $-\ln[A(T)/A_0]$ function in Figure 7b stipulates that the intrinsic logarithmic
rate [$-\ln A(T)$] is higher at higher temperatures (i.e., the Monte Carlo clock should run
faster at higher $T$). This adjustment factor is readily incorporated 26,27 by setting

ln(folding rate) = $\ln k_{\rm f}(T) - \ln[A(T)/A_0]$ ,
ln(unfolding rate) = $\ln k_{\rm u}(T) - \ln[A(T)/A_0]$ ,

where $\ln k_{\rm f}$ and $\ln k_{\rm u}$ on the right hand side are the expressions given respectively by
equations 8 and 9. Macroscopically, this amounts to introducing an additional enthalpic
contribution 24,26 to the free energy barrier of protein folding. The lower part of Figure 7c
shows that incorporating a $-\ln[A(T)/A_0]$ term can lead to a more realistic description
of temperature-dependent protein folding and unfolding rates (cf. experimental data
in Figure 7d; see also Figure 3 of ref. 23, Figure 1C of ref. 24, Figure 3 of ref. 36, and
Figure 5 of ref. 38).



DISCUSSION: A NEAR-LEVINTHAL SCENARIO FOR SIMPLE 
TWO-STATE PROTEIN FOLDING/UNFOLDING KINETICS 

The results of the present investigation suggest strongly that the physical interac- 
tions underlying the simple two-state folding/unfolding kinetics of small single-domain 
proteins should involve many-body effects beyond those stipulated by common additive
Go models, even though the physico-chemical origins of these effects remain to be elu- 
cidated. Therefore, with regard to protein thermodynamic and kinetic cooperativity, 
common Go models with pairwise additive contact energies are not ideal. Apparently, 
the type of many-body interactions that are conducive to simple two-state kinetics also 
lead to higher thermodynamic cooperativity, and entail a partial separation between 
folding-kinetics and native-stabilizing interactions. Historically, an impetus to formu- 
late the Levinthal paradox might have been the discovery in the late 1960s that some 
proteins were calorimetrically two-state 43 (see ref. 4 and references therein). Naturally, 
an extreme interpretation of the AH v Yi/AH ca \ = 1 property would imply that only two 
enthalpy levels exist (native and denatured), and thus the landscape should resemble a 
golf course (Figure 8a). However, a golf-course landscape dictates that folding would 
be exceedingly slow, but the folding of real proteins is relatively fast. To address the 
"why is folding fast" question, recent theoretical discussions emphasize the funnel-like 
nature of the protein folding energy landscape as a solution to the Levinthal paradox 44 ' 45 
(Figure 8b); and common Go potentials are often used to model a relatively smooth 
funnel-like energy landscape. 12 We found that calorimetric two-state cooperativity can 
be consistent with the funnel-like landscapes of three-dimensional Go models, provided 
some latitude is allowed for empirical baseline subtractions. This is because in these 
native-centric models, the conformational populations with intermediate energies (enthalpies), 
though not zero, are relatively low. 5-8 

However, as discussed above, common Go models are insufficient for simple two-state 
protein folding kinetics. 7,8 Because kinetic traps are still significant in these constructs 
under native conditions, the chevron plots they predicted have severe rollovers. 46 An- 
other shortcoming of common Go models is that their predicted folding rates are often 
too fast compared to those of real proteins. 8,14 In one example, it is at least four orders 
of magnitude faster. 8 So, in a sense, in the context of recent Go modeling efforts, the 
critical question has shifted from "why is folding fast" to "why is folding slow." The 
present study concludes that a thermodynamic cooperativity higher than that afforded 
by common additive Go models is necessary for simple two-state kinetics (Figures 2 and 
3). This scenario also offers a clue to the "why is folding slow" question. For real, small, 
single-domain proteins, it appears that the key to avoiding kinetic traps and chevron 
rollover is to have only weakly favorable intrachain interactions during the folding pro- 
cess (gentle upper slopes of the funnel in Figure 8c) until a significant fraction of the 
chain is native-like and ready to come together to form a large number of native contacts 
at once, at which point strong cooperative many-body effects kick in to stabilize the 
structure (steep lower slopes of the funnel in Figure 8c). This idea is implemented in 
the cooperative models studied here. Indeed, in Figures 3 and 5, the quasi-linear folding 
regimes of the cooperative models are in the weakly-interacting unfolding regimes (small 
-ε/k_BT) of the corresponding additive Go models. Thus, for folding of the cooperative 
models, the energetic bias is not very strong during most of the conformational search. 
This feature serves to diminish the effects of kinetic traps (shallow minima on the gentle 
upper slopes of Figure 8c), because the depths of kinetic traps in heteropolymers are 
often correlated with the overall energetic bias towards the native structure. 27 As a re- 
sult, folding is faster in the cooperative models relative to other heteropolymer models 
with deep kinetic traps. But at the same time, the very feature of a weakened energetic 
bias towards the native structure during most of the conformational search also leads to 
slower folding in comparison with common Go models, because the latter have stronger 
native biases during the corresponding kinetic process. Nonetheless, the reduction in 
folding rate relative to common Go models does not make the cooperative models less 
proteinlike, because even a small bias is sufficient to circumvent the Levinthal paradox, 
as all of our models fold. In this way, the cooperative model scenario provides a physically 
plausible answer to the "why is folding slow" question posed above. In fact, for real 
proteins, folding rates may be further reduced by the inevitable presence of favorable 
nonnative interactions, which are not taken into account by the native-centric coopera- 
tive models here. Anti-cooperativity of certain hydrophobic interactions 47 may also play 
a role in discouraging premature chain collapse. In this scenario, the energy landscape 
of a simple two-state protein is still funnel-like but with a narrow bottleneck (Figure 8c). 
The resulting highly bimodal distribution of energy and high thermodynamic coopera- 
tivity thus approach (though never equal) that of an hypothetical Levinthal golf course. * 
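
The contrast between a weak energetic bias during most of the search and a strong stabilization on reaching the native structure can be illustrated with a toy calculation. The sketch below assumes the simplest energy form suggested by the caption of Figure 5: an energy ε per native contact plus an extra energy E_gs assigned only when all native contacts are present (taken here as a stand-in for the ground state). The function names and default values (28 native contacts, ε = -1) are illustrative assumptions.

```python
def energy(n_native, n_max=28, eps=-1.0, e_gs=0.0):
    """Energy of a conformation with n_native native contacts.
    e_gs = 0 corresponds to a common additive Go model; e_gs < 0 adds an
    extra favorable energy only when all n_max native contacts are formed
    (assumed here to identify the ground state)."""
    e = eps * n_native
    if n_native == n_max:
        e += e_gs
    return e

def last_step_gain(n_max=28, eps=-1.0, e_gs=0.0):
    """Energy change on forming the final native contact."""
    return energy(n_max, n_max, eps, e_gs) - energy(n_max - 1, n_max, eps, e_gs)
```

With e_gs = -14, every conformation short of the ground state has the same energy as in the additive model, so the bias during the search is unchanged; only the final step into the native structure is made much more favorable.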

Part of the present view is similar to that of Jewett et al., 30 who recently introduced a 
native-centric model in which the energy E of a conformation, unlike that in common 
Go models, does not decrease linearly with the number of native contacts. (Conformations 
with more negative E's are more favorable.) In their model, the rate of decrease 
in E with increasing fractional number of native contacts Q becomes progressively higher 
as the native structure is approached (when Q → 1). This means that the energetic 
bias towards the native structure is not strong during the initial stages of folding when 
there are few native contacts (small Q), but becomes stronger when the native structure 
is approached during the final stages of folding (Q → 1). Insofar as this general trend 
is concerned, the physical picture of cooperative folding discussed above (especially the 
model defined by equation 5) is very much similar to that of Jewett et al. Nevertheless, 
although both the model of Jewett et al. and the present cooperative models have high 
degrees of thermodynamic cooperativity, their underlying mechanisms are not identical. 
More recent investigations indicate that detailed kinetic features, such as the correlation 
between CO and folding rate, do depend significantly on how thermodynamic cooperativity 
is achieved microscopically. 33,34 Since the lattice model of Jewett et al. was inspired 
by the more general topomer search model of folding, 48 it would be extremely interesting 
to compare in future investigations the relationships between the topomer search model 
and the several different scenarios of thermodynamic cooperativity explored here. 

*We emphasize that the relatively smooth funnel drawings in Figure 8 should be viewed only as 
pictorial devices for underscoring the smoothness of the energy landscapes of native-centric models 
relative to that of models with deeper kinetic traps. Even for Go and Go-like models, energy landscapes 
cannot be completely smooth because of repulsive interactions (including excluded volume effects) and 
other microscopic energy barriers due to bond rotations and solvation, for example (refs. 8, 26, 27; cf. 
equation 10). 

As we have emphasized, 4,5 the experimental calorimetric two-state criterion, which 
has proven useful for evaluating protein chain models, 4-8,33,34,49-52 does not imply that 
there are only two infinitely sharp energy (or enthalpy) levels. In other words, the 
calorimetric two-state criterion does not preclude the existence of "partially unfolded" 
conformations with energies intermediate between the energy distribution peaks under 
strongly folding and strongly denaturing conditions (c.f. Figure 16 of ref. 4, Figure 10 
of ref. 5, Figure 1 of ref. 7, and Figures 7-9 of ref. 8). These theoretical findings are 
consistent with native-state hydrogen exchange experiments. 53 Conformational popula- 
tions with intermediate energies in several calorimetrically cooperative models tested 
thus far (see figure references above) share some similarities with those in calorimetrically 
non-cooperative constructs such as certain HP models 4 and a 15mer 20-letter sidechain 
model. 5,54,55 However, the critical difference between calorimetrically cooperative and 
non-cooperative models is that, at the transition midpoint, conformational populations with 
intermediate energies are not significant for calorimetrically cooperative models but are 
significant for calorimetrically non-cooperative models. This difference is well characterized 
by the ΔH_vH/ΔH_cal ratio. 4,5,49-52 Our approach of evaluating chain models by 
experimental cooperativity criteria is designed to address what kind of elementary intra- 
chain interactions may be needed to produce the generic cooperative features of proteins, 
while taking into account as much as possible that proteins are polymers and therefore 
chain connectivity and excluded volume are severe constraints. In this respect, investigations 
using self-contained polymer models such as those conducted here are fundamentally 
more informative than those that use postulated free energy profiles in the absence of 
explicit chain representations (see, e.g., ref. 56). 
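
The sensitivity of the van't Hoff to calorimetric enthalpy ratio to intermediate-energy populations can be illustrated with toy densities of states. The sketch below evaluates κ_2 = 2 T_max [k_B C(T_max)]^(1/2) / ΔH_cal without baseline subtraction, which is our reading of the definition in ref. 5 and should be checked against that reference. The two toy level schemes, the temperature window, and all function names are hypothetical and serve only as an illustration.

```python
import math

def heat_capacity(levels, T, k_b=1.0):
    """C(T) = (<E^2> - <E>^2)/(k_b T^2) for levels = [(energy, degeneracy), ...]."""
    e0 = min(e for e, _ in levels)  # shift energies so exponents are <= 0
    w = [g * math.exp(-(e - e0) / (k_b * T)) for e, g in levels]
    z = sum(w)
    m1 = sum(wi * e for wi, (e, _) in zip(w, levels)) / z
    m2 = sum(wi * e * e for wi, (e, _) in zip(w, levels)) / z
    return (m2 - m1 * m1) / (k_b * T * T)

def kappa2(levels, t_lo=0.05, t_hi=5.0, n=5000, k_b=1.0):
    """2*T_max*sqrt(k_b*C(T_max)) divided by the integral of C over [t_lo, t_hi]."""
    dt = (t_hi - t_lo) / n
    ts = [t_lo + i * dt for i in range(n + 1)]
    cs = [heat_capacity(levels, t, k_b) for t in ts]
    h_cal = sum(0.5 * (cs[i] + cs[i + 1]) * dt for i in range(n))  # trapezoid rule
    i_max = max(range(n + 1), key=cs.__getitem__)
    return 2.0 * ts[i_max] * math.sqrt(k_b * cs[i_max]) / h_cal

# Toy model 1: two sharp levels (one native state, 1e12 denatured states).
two_level = [(-30.0, 1.0), (0.0, 1e12)]
# Toy model 2: a ladder of 31 levels, all equally populated at the midpoint.
ladder = [(-30.0 + i, math.exp(i * math.log(1e12) / 30.0)) for i in range(31)]
```

For these toy inputs, `kappa2(two_level)` comes out close to 1, whereas `kappa2(ladder)` is well below 1 because intermediate energies are heavily populated at the transition midpoint.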

In summary, our results suggest that intramolecular recognition in real two-state 
proteins is highly specific. As well, the role of many-body interactions in providing 
a larger average energetic difference between "native" and "denatured" conformations 
than that afforded by common pairwise additive interaction schemes has potentially 
important implications for the discrimination of decoys in protein structure prediction. 57 
In principle, the many-body interactions proposed in the present work can be character- 
ized quantitatively through careful experiments and extensive atomic simulations. How 
sidechain packing, sidechain/mainchain correlation, 58 and interactions such as hydrogen 
bonding contribute to this mechanism remains to be investigated. To help address these 
questions, the ramifications of the different scenarios explored in this work need to be 
first delineated in greater detail. 33,34 Lattice model studies of Klimov and Thirumalai 
showed that sidechain degrees of freedom increase the sharpness of the thermodynamic 
folding/unfolding transition relative to that of the corresponding (mainchain) model 
with no sidechains. 54 However, their short-chain 20-letter sidechain models configured 
on three-dimensional simple cubic lattices do not appear to be calorimetrically cooperative. 
The ΔH_vH/ΔH_cal ratio of one of the sidechain sequences studied in refs. 54 and 
55 is equal to κ_2 = 0.38 without baseline subtraction, and increases only to κ_2 = 0.54 
after reasonable baseline subtractions. 5 These values are far from the ΔH_vH/ΔH_cal ≈ 1 
required for calorimetrically cooperative behavior. Kinetically, even a Go-like version 
of their sidechain model exhibits a severe chevron rollover (Figure 3 of ref. 59). These 
results imply that while sidechain contributions are expected to be important for protein 
cooperativities, 54 their role has yet to be better elucidated. 

Finally, while the present study focuses on the behavior of small single-domain pro- 
teins, we hasten to emphasize that not all proteins have simple two-state folding/unfolding 
kinetics. Hence, the high cooperativity requirement deduced in the above analysis may 
not apply to other proteins. In fact, one distinct advantage of the energy landscape per- 
spective and self-contained polymer modeling is their ability to cover a wide spectrum 
of possible protein behavior under a unified physical framework. For instance, although 
common additive Go models are insufficient for simple two-state kinetics, they are useful 
for understanding real proteins with chevron rollovers. 7,8,46,60 And even calorimetrically 
non-cooperative models (see discussion in ref. 5) may prove to be helpful in rationalizing 
downhill protein folding 61 as well. 



Acknowledgments 

We thank Yawen Bai, Robert L. Baldwin, Alan Davidson, Julie Forman-Kay, Michael 
Levitt, Kevin Plaxco, Steve Plotkin, Boris Steipe, and Dev Thirumalai for helpful dis- 
cussions, and Kevin Plaxco for kindly providing ref. 30 before publication. This work 
was partially supported by the Canadian Institutes of Health Research (CIHR grant no. 
MOP-15323), a Premier's Research Excellence Award from the Province of Ontario, and 
the Ontario Centre for Genomic Computing at the Hospital for Sick Children in Toronto. 
H. S. C. is a Canada Research Chair in Biochemistry. 






References 

1. Chan HS. Matching speed and locality. Nature 1998;392:761-763. 

2. Plaxco KW, Simons KT, Ruczinski I, Baker D. Topology, stability, sequence, and 
length: Defining the determinants of two-state protein folding kinetics. Biochemistry 
2000;39:11177-11183. 

3. Chan HS, Kaya H, Shimizu S. Computational methods for protein folding: scaling 
a hierarchy of complexities. In: Jiang T, Xu Y, Zhang MQ, editors. Current Topics 
in Computational Molecular Biology. Cambridge, MA: The MIT Press; 2002. p 403- 
447. 

4. Chan HS. Modeling protein density of states: Additive hydrophobic effects are in- 
sufficient for calorimetric two-state cooperativity. Proteins 2000;40:543-571. 

5. Kaya H, Chan HS. Polymer principles of protein calorimetric two-state cooperativity. 
Proteins 2000;40:637-661 [Erratum: Proteins 2001;43:523]. 

6. Kaya H, Chan HS. Energetic components of cooperative protein folding. Phys Rev 
Lett 2000;85:4823-4826. 

7. Kaya H, Chan HS. Towards a consistent modeling of protein thermodynamic and 
kinetic cooperativity: How applicable is the transition state picture to folding and 
unfolding? J Mol Biol 2002;315:899-909. 

8. Kaya H, Chan HS. Solvation effects and driving forces for protein thermodynamic 
and kinetic cooperativity: How adequate is native-centric topological modeling? J 
Mol Biol 2003;326:911-931. 

9. Go N. Theoretical studies of protein folding. Annu Rev Biophys Bioeng 1983;12:183- 
210. 

10. Bryngelson JD, Wolynes PG. Spin glasses and the statistical mechanics of protein 
folding. Proc Natl Acad Sci USA 1987;84:7524-7528. 

11. Taketomi H, Ueda Y, Go N. Studies on protein folding, unfolding and fluctuations 
by computer simulation. 1. The effect of specific amino acid sequence represented by 
specific inter-unit interactions. Int J Pept Protein Res 1975;7:445-459. 






12. Clementi C, Nymeyer H, Onuchic JN. Topological and energetic factors: What de- 
termines the structural details of the transition state ensemble and "en-route" in- 
termediates for protein folding? An investigation for small globular proteins. J Mol 
Biol 2000;298:937-953. 

13. Li L, Shakhnovich EI. Constructing, verifying, and dissecting the folding transition 
state of chymotrypsin inhibitor 2 with all-atom simulations. Proc Natl Acad Sci USA 
2001;98:13014-13018. 

14. Portman JJ, Takada S, Wolynes PG. Microscopic theory of protein folding rates. II. 
Local reaction coordinates and chain dynamics. J Chem Phys 2001;114:5082-5096. 

15. Micheletti C, Lattanzi G, Maritan A. Elastic properties of proteins: Insight on 
the folding process and evolutionary selection of native structures. J Mol Biol 
2002;321:909-921. 

16. Linhananta A, Zhou Y. The role of sidechain packing and native contact interactions 
in folding: Discontinuous molecular dynamics folding simulations of an all-atom Go 
model of fragment B of Staphylococcal protein A. J Chem Phys 2002;117:8983-8995. 

17. Plaxco KW, Simons KT, Baker D. Contact order, transition state placement and 
the refolding rates of single domain proteins. J Mol Biol 1998;277:985-994. 

18. Matthews CR. Effect of point mutations on the folding of globular proteins. Methods 
Enzymol 1987;154:498-511. 

19. Cheung MS, Garcia AE, Onuchic JN. Protein folding mediated by solvation: Water 
expulsion and formation of the hydrophobic core occur after the structural collapse. 
Proc Natl Acad Sci USA 2002;99:685-690. 

20. Dill KA. Dominant forces in protein folding. Biochemistry 1990;29:7133-7155. 

21. Kolinski A, Galazka W, Skolnick J. On the origin of the cooperativity of protein 
folding: Implications from model simulations. Proteins 1996;26:271-287. 

22. Plotkin SS, Wang J, Wolynes PG. Statistical mechanics of a correlated energy land- 
scape model for protein folding funnels. J Chem Phys 1997;106:2932-2948. 






23. Schindler T, Schmid FX. Thermodynamic properties of an extremely rapid protein 
folding reaction. Biochemistry 1996;35:16833-16842. 

24. Scalley ML, Baker D. Protein folding kinetics exhibit an Arrhenius temperature 
dependence when corrected for the temperature dependence of protein stability. Proc 
Natl Acad Sci USA 1997;94:10636-10640. 

25. Abkevich VI, Gutin AM, Shakhnovich EI. Free energy landscape for protein folding 
kinetics: Intermediates, traps, and multiple pathways in theory and lattice model 
simulations. J Chem Phys 1994;101:6052-6062. 

26. Chan HS. Modelling protein folding by Monte Carlo dynamics: Chevron plots, 
chevron rollover, and non-Arrhenius kinetics. In: Grassberger P, Barkema GT, 
Nadler W, editors. Monte Carlo Approach to Biopolymers and Protein Folding. 
Singapore: World Scientific; 1998. p 29-44. 

27. Chan HS, Dill KA. Protein folding in the landscape perspective: Chevron plots and 
non-Arrhenius kinetics. Proteins 1998;30:2-33. 

28. Northey JGB, Di Nardo AA, Davidson AR. Hydrophobic core packing in the SH3 
domain folding transition state. Nature Struct Biol 2002;9:126-130. 

29. Ventura S, Vega MC, Lacroix E, Angrand I, Spagnolo L, Serrano L. Conformational 
strain in the hydrophobic core and its implications for protein folding and design. 
Nature Struct Biol 2002;9:485-493. 

30. Jewett AI, Pande VS, Plaxco KW. Cooperativity, smooth energy landscapes and the 
origins of topology-dependent protein folding rates. J Mol Biol 2003;326:247-253. 

31. Koga N, Takada S. Roles of native topology and chain-length scaling in protein 
folding: A simulation study with a Go-like model. J Mol Biol 2001;313:171-180. 

32. Miller EJ, Fischer KF, Marqusee S. Experimental evaluation of topological parameters 
determining protein-folding rates. Proc Natl Acad Sci USA 2002;99:10359-10363. 

33. Kaya H, Chan HS. Contact order dependent protein folding rates: Kinetic conse- 
quences of a cooperative interplay between favorable nonlocal interactions and local 
conformational preferences. Proteins 2003; in press. 






34. Chan HS, Shimizu S, Kaya H. Cooperativity principles in protein folding. Methods 
Enzymol 2003; in press. 

35. Segawa S, Sugihara M. Characterization of the transition state of lysozyme unfold- 
ing. I. Effect of protein-solvent interactions on the transition state. Biopolymers 
1984;23:2473-2488. 

36. Chen B-L, Baase WA, Schellman JA. Low-temperature unfolding of a mutant of 
phage T4 lysozyme. 2. Kinetic investigations. Biochemistry 1989;28:691-699. 

37. Jackson SE, Fersht AR. Folding of chymotrypsin inhibitor 2. I. Evidence for a two- 
state transition. Biochemistry 1991;30:10428-10435. 

38. Kuhlman B, Luisi DL, Evans PA, Raleigh DP. Global analysis of the effects of 
temperature and denaturant on the folding and unfolding kinetics of the N-terminal 
domain of the protein L9. J Mol Biol 1998;284:1661-1670. 

39. Makhatadze GI, Privalov PL. Energetics of protein structure. Adv Protein Chem 
1995;47:307-425. 

40. Lee CY, McCammon JA, Rossky PJ. The structure of liquid water at an extended 
hydrophobic surface. J Chem Phys 1984;80:4448-4455. 

41. Lum K, Chandler D, Weeks JD. Hydrophobicity at small and large length scales. J 
Phys Chem B 1999;103:4570-4577. 

42. Shimizu S, Chan HS. Origins of protein denatured states compactness and hydropho- 
bic clustering in aqueous urea: Inferences from nonpolar potentials of mean force. 
Proteins 2002;49:560-566. 

43. Baldwin RL. Folding intermediates in protein folding. BioEssays 1994;16:207-210. 

44. Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG. Funnels, pathways, and the 
energy landscape of protein folding: A synthesis. Proteins 1995;21:167-195. 

45. Dill KA, Chan HS. From Levinthal to pathways to funnels. Nature Struct Biol 
1997;4:10-19. 

46. Kaya H, Chan HS. Origins of chevron rollovers in non-two-state protein folding 
kinetics. Phys Rev Lett 2003; 258104. 






47. Shimizu S, Chan HS. Anti-cooperativity and cooperativity in hydrophobic in- 
teractions: Three-body free energy landscapes and comparison with implicit- 
solvent potential functions for proteins. Proteins 2002;48:15-30 [Erratum: Proteins 
2002;49:294]. 

48. Makarov DE, Plaxco KW. The topomer search model: A simple, quantitative theory 
of two-state protein folding kinetics. Protein Sci 2003;12:17-26. 

49. Crippen GM, Chhajer M. Lattice models of protein folding permitting disordered 
native states. J Chem Phys 2002;116:2261-2268. 

50. Fan K, Wang J, Wang W. Folding of lattice protein chains with modified Go poten- 
tial. Eur Phys J B 2002;30:381-391. 

51. Clementi C, Garcia AE, Onuchic JN. Interplay among tertiary contacts, secondary 
structure formation and side-chain packing in the protein folding mechanism: All- 
atom representation study of protein L. J Mol Biol 2003;326:933-954. 

52. Pokarowski P, Kolinski A, Skolnick J. A minimal physically realistic protein-like 
lattice model: Designing an energy landscape that ensures all-or-none folding to a 
unique native state. Biophys J 2003;84:1518-1526. 

53. Englander SW, Mayne L, Bai Y, Sosnick TR. Hydrogen exchange: The modern 
legacy of Linderstrøm-Lang. Protein Sci 1997;6:1101-1109. 

54. Klimov DK, Thirumalai D. Cooperativity in protein folding: From lattice models 
with sidechains to real proteins. Fold Des 1998;3:127-139. 

55. Klimov DK, Thirumalai D. Is there a unique melting temperature for two-state 
proteins? J Comput Chem 2002;23:161-165. 

56. Zwanzig R. Simple model of protein folding kinetics. Proc Natl Acad Sci USA 
1995;92:9801-9804. 

57. Park BH, Huang ES, Levitt M. Factors affecting the ability of energy functions to 
discriminate correct from incorrect folds. J Mol Biol 1997;266:831-846. 

58. Dunbrack RL. Rotamer libraries in the 21st century. Curr Opin Struct Biol 
2002;12:431-440. 






59. Li MS, Klimov DK, Thirumalai D. Folding in lattice models with side chains. Comput 
Phys Comm 2002;147:625-628. 

60. Galzitskaya OV, Garbuzynskiy SO, Ivankov DN, Finkelstein AV. Chain length is the 
main determinant of the folding rate for proteins with three-state folding kinetics. 
Proteins 2003;51:162-166. 

61. Garcia-Mira MM, Sadqi M, Fischer N, Sanchez-Ruiz JM, Muñoz V. Experimental 
identification of downhill protein folding. Science 2002;298:2191-2195. 






Table I. Deviations from Arrhenius behavior in protein folding and unfolding. 

Protein name             (ΔC_p‡)_f/(ΔC_p‡)_u 
T4 lysozyme mutant (a)   -2.99 
CI2 (b)                  -2.95 
CspB (c)                 -9.0 
Protein L (d)            -1.68 
NTL9 (e)                 -1.33 

(a) Chen et al. (ref. 36) 
(b) Jackson & Fersht (ref. 37) 
(c) Schindler & Schmid (ref. 23) 
(d) Scalley & Baker (ref. 24) 
(e) Kuhlman et al. (ref. 38) 

Figure Captions 



Figure 1. Modeling a cooperative interplay between local conformational preference 
and protein core packing in a 55mer native-centric four-helix-bundle model. Certain 
native helices are shown as thick solid lines for illustrative purposes. Their sequence 
positions are indicated by the thin dotted lines depicting the rest of the full native structure. 
Given a native helix is completely formed [the front right helix appearing in both 
(a) and (b) in this example], a favorable cooperative energy E_coop (< 0) is assigned if 
either (a) 4 or more of the 6 native contacts (thick dotted lines) between the given fully 
formed native helix and each of the two chain segments for the two flanking native helices 
are present, or (b) at least 2 of the 3 "diagonal" native contacts are present between the 
given fully formed native helix and the chain segment for the diagonally neighboring 
native helix, or both [(a) and (b)]. Thus, condition (a) requires at least 8 inter-helix 
nearest-neighbor native contacts, whereas condition (b) requires at least 2 inter-helix 
next-nearest-neighbor native contacts on the simple cubic lattice. It follows that the 
maximum total contribution from these cooperative energies is 4E_coop when all four native 
helices are completely formed and correctly packed against one another. 
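
The assignment rule in this caption can be written as a short sketch. The data layout (a tuple per helix holding its formation status and contact counts), the function names, and the default value of E_coop are illustrative assumptions; the thresholds are those stated in the caption.

```python
def helix_cooperative_energy(formed, flank_contacts, diagonal_contacts, e_coop=-1.0):
    """Cooperative energy contributed by one native helix.

    formed            -- True if the helix is completely formed
    flank_contacts    -- (n1, n2): native contacts (0 to 6) with each of the
                         two flanking helix segments
    diagonal_contacts -- "diagonal" native contacts (0 to 3) with the
                         diagonally neighboring helix segment
    """
    if not formed:
        return 0.0
    cond_a = all(n >= 4 for n in flank_contacts)  # 4 of 6 with EACH flanking segment
    cond_b = diagonal_contacts >= 2               # at least 2 of the 3 diagonal contacts
    # The energy is assigned once, whether (a), (b), or both are satisfied.
    return e_coop if (cond_a or cond_b) else 0.0

def total_cooperative_energy(helices, e_coop=-1.0):
    """Sum over helices given as (formed, flank_contacts, diagonal_contacts)."""
    return sum(helix_cooperative_energy(f, fl, d, e_coop) for f, fl, d in helices)
```

For a fully native four-helix bundle, for example `[(True, (6, 6), 3)] * 4`, the total is 4E_coop, the stated maximum.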

Figure 2. The overall thermodynamic cooperativity of a model protein is boosted 
by the many-body interactions described in Figure 1 and in the text. Upper panel: 
Heat capacity as a function of temperature for (i) the pairwise additive native-centric 
55mer model (equation 1), (ii) a model having the interactions in (i) plus the cooperative 
interaction scheme in Figure 1 with E_coop = -1.0 (equation 2), and (iii) a model with 
the interactions in (ii) plus an extra favorable energy of E_gs = -2.0 for the ground-state 
conformation (equation 3). The heat capacity scans were obtained using Monte Carlo 
histogram techniques based on conformational sampling around each model's transition 
midpoint. 5-7 The inset (from ref. 7) shows the 55mer model ground-state structure with 
(nominally) hydrophobic and polar residues depicted respectively as filled and open circles. 
Lower panel: Van't Hoff to calorimetric enthalpy ratios ΔH_vH/ΔH_cal are given by 
κ_2 defined in ref. 5 (without empirical baseline subtractions) for three classes of 55mer 
models whose interaction schemes are parametrized by an E_coop variable. (a) As in (i) 
above plus an extra favorable energy of E_coop for the ground-state conformation (equation 4). 
(b) As in (ii) above but, instead of fixing E_coop = -1.0, a variable E_coop is used 
for the helix packing contribution defined by Figure 1 and equation 2. (c) As in (b) plus 
an extra favorable energy of E_gs = E_coop for the ground-state conformation (equation 3). 

Figure 3. Chevron plots for the 55mer models (i), (ii), and (iii) in Figure 2 are provided 
by the negative natural logarithm of mean first passage time (MFPT) as functions 
of interaction strength ε/k_BT. The present Monte Carlo (MC) dynamics simulations use 
the same general procedure as that in ref. 7, now with a move set consisting of end flips 
(4.7%), corner flips (58.3%), crankshafts (27%) and rigid rotations (10%). Folding (filled 
symbols) starts from a randomly generated conformation. First passage time (FPT) for 
folding is the number of attempted MC moves needed to reach the ground-state conformation. 
Unfolding (open symbols) starts from the ground-state conformation. Here 
unfolding FPT is the number of attempted MC moves needed to reach a conformation 
with fewer than 7 native contacts. Each plotted MFPT is averaged from 1,000 trajectories. 
Solid and dashed curves through the data points are mere guides for the eye. The 
dotted V-shape for model (iii) is an hypothetical simple two-state chevron plot consistent 
with the ε/k_BT dependence of thermodynamic stability as given by the free energy 
difference ΔG_u between the ground state and the unfolded conformational ensemble with 
< 7 native contacts. 
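
Two bookkeeping steps described in this caption can be sketched as follows: choosing the type of each attempted MC move with the stated frequencies, and converting a set of first passage times into the plotted quantity, -ln(MFPT). The names below are hypothetical, and the chain moves themselves (and their acceptance test) are not implemented here.

```python
import math
import random

# Attempt frequencies (percent) of the move types quoted in the caption.
MOVE_WEIGHTS = {
    "end_flip": 4.7,
    "corner_flip": 58.3,
    "crankshaft": 27.0,
    "rigid_rotation": 10.0,
}

def pick_move(rng=random):
    """Choose the type of the next attempted MC move."""
    names = list(MOVE_WEIGHTS)
    weights = [MOVE_WEIGHTS[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def chevron_ordinate(first_passage_times):
    """-ln(MFPT), with MFPT the mean over trajectories of the first passage
    time counted in attempted MC moves."""
    mfpt = sum(first_passage_times) / len(first_passage_times)
    return -math.log(mfpt)
```

Passing a seeded `random.Random` instance as `rng` makes a sequence of move choices reproducible.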

Figure 4. Distribution of native contacts in the 55mer models. The total (maximum) 
number of spatial nearest-neighbor native contacts in the ground-state conformation is 
60 (diagonal contacts in Figure 1b are not included in this accounting). Q is the fractional 
number of native contacts for a conformation, defined as the number of native 
contacts it contains divided by the maximum. (a) The free energy profiles of the models 
(i), (ii) and (iii) defined in Figure 2 are given by the negative logarithmic distributions 
of Q. The profiles shown are for ε/k_BT = -2.34, -2.11, and -2.0 respectively for (i), 
(ii) and (iii), near each model's transition midpoint, and were obtained by standard MC 
histogram techniques. 5-7 (b) Correlation between energy E and Q for models (ii) and 
(iii). Dots indicate the existence of conformations with the given E and Q values. The 
open diamond and square mark the ground-state energies of -40.1 and -42.1 for models 
(ii) and (iii) respectively. 
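
The definition of Q and of the free energy profile as a negative logarithmic distribution can be sketched as below. This simplified version uses raw counts from sampling at a single condition rather than the histogram techniques of refs. 5-7; the function name and input format are assumptions made for illustration.

```python
import math
from collections import Counter

def free_energy_profile(contact_counts, n_max=60):
    """Return {Q: -ln P(Q)} in units of k_B T, where Q = n/n_max and P(Q) is
    the fraction of sampled conformations with n native contacts.
    Values of Q that were never sampled are omitted."""
    counts = Counter(contact_counts)
    total = len(contact_counts)
    return {n / n_max: -math.log(c / total) for n, c in sorted(counts.items())}
```

At a transition midpoint with two equally populated peaks, the two minima of such a profile have equal depth, and sparsely populated intermediate Q values appear as a barrier between them.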

Figure 5. Chevron plots for three different 27mer Go models and for their corresponding 
models with an extra favorable energy E_gs assigned to the native (ground-state) 
structure (as shown), with ε = -1 (equation 5). Folding MFPT is independent of E_gs, 
and ε has the same meaning as in Figure 3. Here each MFPT is averaged from 500 
trajectories. Folding (filled symbols) and unfolding (open symbols) simulations were 
performed as for Figure 3 except only local chain moves were used for the present 27mer 
MC dynamics simulation (no rigid rotations), and unfolding FPT is now defined by 
the time needed to reach a conformation with fewer than 4 native contacts from a given 
starting ground-state conformation with 28 native contacts. Solid curves are mere guides 
for the eye. (a) Unfolding chevron arms for eight models with different degrees of cooperativity 
are shown; from top to bottom, E_gs = 0 (common additive Go model), E_gs = 
-2, -4, -6, -8, -10, -12, and -14. (b) and (c) Unfolding chevron arms for the 
common additive Go model (E_gs = 0, upper curves) are compared with the unfolding 
arms for E_gs = -14 (lower quasi-linear curves). The dotted V-shapes are hypothetical 
simple two-state chevron plots consistent with the ε/k_BT dependence of ΔG_u between 
the ground state and the unfolded conformational ensemble with < 4 native contacts of 
a given model; ΔG_u values are determined by standard histogram techniques 5-7 based 
on conformational sampling at ε/k_BT = -0.935, -0.917 and -0.917 for (a), (b) and (c) 
respectively (detailed data not shown). 

Figure 6. Comparing chevron plots for the three different models in Figure 5 with 
E_gs = -14 shows that a model with lower native contact order (CO, as defined in ref. 17) 
tends to fold slightly faster. Here CO = 0.28, 0.40, and 0.51 for models (a), (b) and (c) 
respectively. 
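
For reference, a minimal sketch of a relative contact order calculation is given below, using the form CO = (1/(L N)) Σ |i - j| over the N native contacts of a chain of length L, which is our understanding of the definition in ref. 17. The function name and the contact-list input format are assumptions.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order: the average sequence separation |i - j| of the
    native contacts (i, j), divided by the chain length."""
    if not contacts:
        raise ValueError("at least one native contact is required")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (chain_length * len(contacts))
```

A native structure dominated by sequence-local contacts gives a small CO, whereas one rich in contacts between residues far apart along the chain gives a large CO.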

Figure 7. Rationalizing non-Arrhenius protein folding and unfolding kinetics. (a) A 
linear fit of the logarithmic unfolding rates (vertical axis) of the 27mer models in Figure 5a 
with -E_gs > 6 to the expression shown (horizontal axis). For the data points plotted, 
the correlation coefficient r = 0.998. (b) Solid curve: an hypothetical hydrophobic-like 
temperature dependence of the model interaction strength -ε/k_BT that drives folding 
kinetics (left vertical scale). Dashed line: an hypothetical temperature dependence of 
the intrinsic conformational transition rate 1/A(T) relative to a reference rate 1/A_0 (right 
vertical scale). More analytical details of this physical picture are provided in refs. 26 
and 27. (Note that a typographical error should be corrected in the caption for Figure 4 
of ref. 26: "ΔS = 5.2" should read "ΔS = -5.2.") (c) Upper curves: Temperature-dependent 
folding and unfolding rates obtained by combining the solid curve in (b) and 
data from Figure 5a (equations 8 and 9, temperature-dependent intrinsic conformational 
transition rate not taken into account). Lower curves: Temperature-dependent folding 
and unfolding rates obtained by combining the solid curve and dashed line in (b) and data 
from Figure 5a (equation 10, temperature-dependent intrinsic conformational transition 
rate taken into account). (d) Included for comparison are the temperature-dependent 
CI2 folding and unfolding rates at 25°C and pH 6.3 from the experiments of Jackson and 
Fersht (adapted from Figure 4 of ref. 37). The upper and lower folding curves were for 
M and 1.5 M GdnHCl respectively, the unfolding curve was measured at M GdnHCl. 

Figure 8. Schematics of hypothetical and proposed energy landscapes for protein 
folding. (a) A golf-course or "Levinthal" landscape. (b) A funnel landscape. (c) A 
"near-Levinthal" scenario. 